202 research outputs found

    Less is More: Restricted Representations for Better Interpretability and Generalizability

    Deep neural networks are prevalent in supervised learning across a wide range of tasks, such as image classification, machine translation, and even scientific discovery. Their success often comes at the expense of interpretability and generalizability. The increasing complexity of models and the involvement of pre-training make their opacity a more pressing problem. Their outstanding performance when labeled data are abundant, combined with their tendency to overfit when labeled data are limited, shows how difficult it is for deep neural networks to generalize across datasets. This thesis aims to improve interpretability and generalizability by restricting representations. We approach interpretability through attribution analysis, to understand which features contribute to BERT's predictions, and we approach generalizability through methods that are effective in a low-data regime. We consider two strategies for restricting representations: (1) adding a bottleneck, and (2) introducing compression. Given input x, suppose we want to learn y through the latent representation z (i.e., x→z→y). Adding a bottleneck means adding a function R such that L(R(z)) < L(z), and introducing compression means adding a function R such that L(R(y)) < L(y), where L denotes the number of bits. In other words, the restriction is added either in the middle of the pipeline or at the end of it. We first introduce how adding an information bottleneck can help attribution analysis and apply it to investigate BERT's behavior on text classification in Chapter 3. We then extend this attribution method to analyze passage reranking in Chapter 4, where we conduct a detailed analysis of cross-layer and cross-passage behavior. Adding a bottleneck not only provides insight into deep neural networks but can also increase generalizability. In Chapter 5, we demonstrate the equivalence between adding a bottleneck and performing neural compression. We then build on this finding with a framework called Non-Parametric learning by Compression with Latent Variables (NPC-LV), and show how optimizing neural compressors can be used for non-parametric image classification with few labeled data. To further investigate how compression alone helps non-parametric learning without latent variables (NPC), we carry out experiments with a universal compressor, gzip, on text classification in Chapter 6. In Chapter 7, we elucidate methods that adopt the perspective of compression without the actual process of compression, using T5. Using experimental results in passage reranking, we show that our method is highly effective in a low-data regime when only one thousand query-passage pairs are available. In addition to the weakly supervised scenario, we also extend our method to large language models like GPT under almost no supervision, in one-shot and zero-shot settings. The experiments, presented in Chapter 8, show that without extra parameters or in-context learning, GPT can be used for semantic similarity, text classification, and text ranking, and outperforms strong baselines. The thesis tackles two major challenges in machine learning, interpretability and generalizability, by restricting representations. We provide both theoretical derivations and empirical results to show the effectiveness of information-theoretic approaches. We not only design new algorithms but also provide numerous insights into why and how compression is so important in understanding deep neural networks and improving generalizability.
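The abstract above mentions using gzip, a universal compressor, for non-parametric text classification. A minimal sketch of that idea follows. It assumes the Normalized Compression Distance as the similarity measure and a k-nearest-neighbour vote, neither of which the abstract spells out; the names `clen`, `ncd`, and `classify` are hypothetical and are not taken from the thesis.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Compressed length in bytes, used as a rough proxy for L (number of bits)."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a: str, b: str) -> float:
    """Normalized Compression Distance between two strings (assumed metric)."""
    ca, cb = clen(a), clen(b)
    cab = clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def classify(query: str, labeled: list[tuple[str, str]], k: int = 1) -> str:
    """Label `query` by majority vote among its k nearest labeled texts under NCD."""
    ranked = sorted(labeled, key=lambda pair: ncd(query, pair[0]))
    votes = [label for _, label in ranked[:k]]
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    # Tiny illustrative training set; the texts and labels are made up.
    train = [
        ("the striker scored twice in the second half of the match", "sports"),
        ("the central bank raised interest rates to curb inflation", "finance"),
    ]
    print(classify("the goalkeeper saved a penalty late in the match", train))
```

No model parameters are trained here: the only "learning" is the comparison of compressed lengths, which is why this style of method is attractive when few labeled examples are available.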

    Do User Reviews Matter? Empirical Evidence on the Role of User Involvement in App Performance

    The extant literature has often presumed that user involvement is positively associated with software performance. In the context of mobile applications (apps), user reviews are collected to inform app developers about how to improve app quality, by identifying bugs or suggesting new features. However, the value of user reviews varies a great deal because of their unmanageable volume and irrelevant content. In this study, over 40,000 user reviews for 50 apps were analyzed to empirically examine the association between customer-led improvement and app revenues. Our findings indicated that customer-led improvement produced a significant increase in quarterly revenues. Greater revenue growth was also observed when developers responded to user reviews faster. These results provide empirical support for the value of co-creating apps with users, as customers can contribute to the continuous improvement of apps by providing experience-based solutions.

    Examining the Impact of User Reviews On Mobile Applications Development

    User reviews are often collected to inform mobile application (app) developers about areas for improvement and novel features. However, users might not always possess the technical expertise required to make commercially feasible suggestions. The value of user reviews also varies because of their unmanageable volume and irrelevant content. In our study, over 40,000 user reviews for 50 apps will be analyzed using Python and regression analysis to examine the impact of user-led innovation and improvement on app performance, measured by revenues and user ratings. The developers' lead time in responding to user reviews will be included as a moderator to investigate whether app performance improves when developers respond faster. Our study should represent one of the first attempts to offer empirical confirmation of the value of co-creating apps with users.
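The abstract above describes a regression in which developers' response lead time acts as a moderator. One common way to model a moderator, assumed here because the abstract does not give the specification, is an interaction term in ordinary least squares. The function name, variable names, and synthetic numbers below are illustrative only and are not the study's data or code.

```python
import numpy as np


def fit_moderated_ols(x, m, y):
    """Fit y = b0 + b1*x + b2*m + b3*(x*m) by ordinary least squares.

    x: predictor (e.g. a user-led improvement score, hypothetical)
    m: moderator (e.g. response lead time, hypothetical)
    y: outcome (e.g. revenue or rating, hypothetical)
    Returns the coefficients [b0, b1, b2, b3].
    """
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))
    # Columns: intercept, main effects, and the interaction (moderation) term.
    design = np.column_stack([np.ones_like(x), x, m, x * m])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic, noise-free data generated from known coefficients.
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    m = rng.normal(size=200)
    y = 1.0 + 2.0 * x - 0.5 * m + 0.3 * x * m
    print(fit_moderated_ols(x, m, y))
```

In this formulation, moderation shows up in the coefficient b3: if it is non-zero, the effect of x on y depends on the level of m.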

    A Novel Fusion Framework Based on Adaptive PCNN in NSCT Domain for Whole-Body PET and CT Images

    Fused PET and CT images, which combine anatomical and functional information, have important clinical significance. This paper proposes a novel fusion framework based on adaptive pulse-coupled neural networks (PCNNs) in the nonsubsampled contourlet transform (NSCT) domain for fusing whole-body PET and CT images. Firstly, the gradient average of each pixel is chosen as the linking strength of the PCNN model to make it self-adaptive. Secondly, to improve the fusion performance, the novel sum-modified Laplacian (NSML) and energy of edge (EOE) are extracted as the external inputs of the PCNN models for the low- and high-pass subbands, respectively. Lastly, the rule of max region energy is adopted as the fusion rule, and different energy templates are employed in the low- and high-pass subbands. The experimental results on whole-body PET and CT data (239 slices for each modality) show that the proposed framework outperforms six other methods in terms of seven commonly used fusion performance metrics.
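The last step in the abstract above, a max-region-energy fusion rule with an energy template, can be sketched in isolation. This is a simplified illustration under stated assumptions: it applies the rule directly to two coefficient arrays, omits the NSCT decomposition and the PCNN stage entirely, and uses a uniform 3x3 template by default because the abstract does not give the actual templates. The function names are hypothetical.

```python
import numpy as np


def region_energy(coeffs, template=None):
    """Weighted sum of squared coefficients in a window around each position.

    `template` is an odd-sized 2-D weight array; a uniform 3x3 window is
    assumed when none is given (the paper's templates are not in the abstract).
    """
    c = np.asarray(coeffs, dtype=float)
    t = np.ones((3, 3)) if template is None else np.asarray(template, dtype=float)
    ry, rx = t.shape[0] // 2, t.shape[1] // 2
    # Pad by edge replication so every position has a full window.
    sq = np.pad(c ** 2, ((ry, ry), (rx, rx)), mode="edge")
    h, w = c.shape
    energy = np.zeros((h, w))
    for dy in range(t.shape[0]):
        for dx in range(t.shape[1]):
            energy += t[dy, dx] * sq[dy:dy + h, dx:dx + w]
    return energy


def fuse_max_region_energy(a, b, template=None):
    """At each position, keep the coefficient whose neighbourhood has more energy."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pick_a = region_energy(a, template) >= region_energy(b, template)
    return np.where(pick_a, a, b)
```

Passing a different `template` for low-pass and high-pass subbands would mirror the abstract's statement that different energy templates are used for each, although the specific weights would have to come from the paper itself.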

    Analytics In Learning : From Consumer Learning to Organizational Learning

    Ph.D. (Doctor of Philosophy)

    PaperRobot: Incremental Draft Generation of Scientific Ideas

    We present PaperRobot, which acts as an automatic research assistant by (1) conducting a deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, combining graph attention and contextual text attention; and (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate a conclusion and future work, and finally from future work to generate a title for a follow-on paper. Turing Tests, in which a biomedical domain expert is asked to compare a system output and a human-authored string, show that PaperRobot-generated abstracts, conclusion and future work sections, and new titles are chosen over human-written ones up to 30%, 24% and 12% of the time, respectively. Comment: 12 pages. Accepted by ACL 2019. Code and resources are available at https://github.com/EagleW/PaperRobo